Bilingual Words and Phrase Mappings for Marathi and Hindi SMT

Authors

  • Sreelekha S
  • Pushpak Bhattacharyya
Abstract

The lack of proper linguistic resources is one of the major challenges faced in developing machine translation systems for resource-poor languages. In this paper, we describe effective ways to utilize lexical resources to improve the quality of statistical machine translation. Our research on the use of lexical resources focused mainly on two approaches: augmenting the parallel corpus with more vocabulary and providing various word forms. We augmented the training corpus with various lexical resources such as lexical words, function words, kridanta pairs and verb phrases. We describe the case studies, evaluations and detailed error analysis for both Marathi-to-Hindi and Hindi-to-Marathi machine translation systems. From the evaluations, we observed an incremental growth in the quality of machine translation as the usage of various lexical resources increases. Moreover, the use of various lexical resources helps to improve the coverage and quality of machine translation where only a limited parallel corpus is available.
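The corpus augmentation mentioned in the abstract can be pictured with a minimal sketch. It assumes the lexical resources are available as (Marathi, Hindi) string pairs that are simply appended to the sentence-aligned training data; the function name `augment_parallel_corpus` and the sample pairs are hypothetical illustrations, not taken from the paper, and the authors' actual pipeline may differ.

```python
from typing import Iterable, List, Tuple

Pair = Tuple[str, str]  # (Marathi side, Hindi side)


def augment_parallel_corpus(corpus: Iterable[Pair],
                            *resources: Iterable[Pair]) -> List[Pair]:
    """Append lexical-resource pairs to a parallel corpus.

    The original sentence pairs are kept unchanged and in order.
    Resource entries are skipped when either side is empty or when
    the exact pair is already present (an assumption of this sketch).
    """
    augmented = [(src.strip(), tgt.strip()) for src, tgt in corpus]
    seen = set(augmented)
    for resource in resources:
        for src, tgt in resource:
            pair = (src.strip(), tgt.strip())
            if pair[0] and pair[1] and pair not in seen:
                seen.add(pair)
                augmented.append(pair)
    return augmented


# Illustrative data only (not from the paper).
sentences = [("मी घरी जातो", "मैं घर जाता हूँ")]
lexical_words = [("पाणी", "पानी"), ("पाणी", "पानी")]  # repeated entry
function_words = [("आणि", "और"), ("", "और")]          # one malformed entry

augmented = augment_parallel_corpus(sentences, lexical_words, function_words)
```

In this sketch each resource entry is treated as a very short "sentence pair", so the two sides of the augmented corpus stay line-aligned when written out for training.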

Similar resources

The IIT Bombay SMT System for ICON 2014 Tools Contest

In this paper, we describe our submission to the ICON 2014 Tools Contest for Machine Translation. The source languages are English, Marathi, Tamil, Telugu, Bengali and the target language is Hindi. We submitted 15 systems; 5 each for the tourism, health and general domains. Our submission is a Phrase-based Statistical Machine Translation system with preprocessing and post-processing elements. A...

Comparison of SMT and RBMT, The Requirement of Hybridization for Marathi – Hindi MT

In this paper, we present our work comparing Statistical Machine Translation (SMT) and Rule-Based Machine Translation (RBMT) for translation from Marathi to Hindi. Rule-based systems, although robust, take a lot of time to build. Statistical machine translation systems, on the other hand, are easier to create, maintain and improve upon. We describe the development of a basic Marathi-Hindi SMT sy...

Hindi and Marathi to English Cross Language Information Retrieval at CLEF 2007

In this paper, we present our Hindi -> English and Marathi -> English CLIR systems developed as part of our participation in the CLEF 2007 Ad-Hoc Bilingual task. We take a query-translation-based approach using bilingual dictionaries. Query words not found in the dictionary are transliterated using a simple lookup-table-based transliteration approach. The resultant transliteration is then compar...

Journal:
  • CoRR

Volume: abs/1710.02398

Publication date: 2017